A Black-Box Approach to Query Cardinality Estimation
Abstract
We present a “black-box” approach to estimating query cardinality that has no knowledge of query execution plans or data distributions, yet provides accurate estimates. It does so by grouping queries into syntactic families and learning the cardinality distribution of each family directly from points in a high-dimensional input space constructed from the query’s attributes, operators, function arguments, aggregates, and constants. We envision an increasing need for such an approach in applications where query cardinality is required for resource optimization and decision-making at locations remote from the data sources. Our primary case study is the Open SkyQuery federation of astronomy archives, which uses a scheduling and caching mechanism at the mediator to execute federated queries at remote sources. Experiments using real workloads show that the black-box approach produces accurate estimates and is frugal in both space and computation. The black-box approach also dramatically improves caching performance in Open SkyQuery.
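The idea of grouping queries into syntactic families and learning cardinality from the query text alone can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `template_and_point` and `BlackBoxEstimator` are hypothetical, only unsigned numeric constants are used as input dimensions, and a plain k-nearest-neighbor average stands in for whatever learning method the paper actually uses, which the abstract does not specify.

```python
import re
from collections import defaultdict

# Unsigned numeric literals that are not part of an identifier such as "t1".
NUMBER = re.compile(r"(?<![\w.])\d+(?:\.\d+)?(?!\w)")


def template_and_point(sql):
    """Split a query into a syntactic template and its numeric constants."""
    normalized = " ".join(sql.lower().split())
    point = tuple(float(c) for c in NUMBER.findall(normalized))
    template = NUMBER.sub("?", normalized)
    return template, point


class BlackBoxEstimator:
    """Hypothetical estimator: one history of (point, cardinality) per template."""

    def __init__(self, k=3):
        self.k = k
        self.history = defaultdict(list)

    def observe(self, sql, cardinality):
        """Record the true cardinality of a query after it has run."""
        template, point = template_and_point(sql)
        self.history[template].append((point, cardinality))

    def estimate(self, sql):
        """Average the cardinalities of the k nearest past queries in the
        same family, or return None if the family has never been seen."""
        template, point = template_and_point(sql)
        seen = self.history.get(template)
        if not seen:
            return None

        def squared_distance(other):
            return sum((a - b) ** 2 for a, b in zip(other, point))

        nearest = sorted(seen, key=lambda pc: squared_distance(pc[0]))[: self.k]
        return sum(c for _, c in nearest) / len(nearest)
```

The estimator never consults the database, an execution plan, or any statistics: it sees only query text and previously observed result sizes, which is the sense in which such an approach is "black-box". A mediator could call `observe` after each query completes and `estimate` before scheduling or caching a new one.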
Similar resources
Critical Path based Performance Models for Distributed Queries
Programming models such as MapReduce and DryadLINQ provide programmers with declarative abstractions (such as SQL-like query languages) for writing data-intensive computations. The models also provide runtime systems that can execute these queries on a large cluster of machines, while dealing with the vagaries of distribution such as messaging, failures and synchronization. However, this level ...
Adaptive Cardinality Estimation
In this paper we address the cardinality estimation problem, which is an important subproblem in query optimization. Query optimization is the part of every relational DBMS responsible for finding the best way to execute a given query. These ways are called plans. The execution times of different plans may differ by several orders of magnitude, so the query optimizer has a great influence on the whole DBMS ...
Efficient Generation of Query Optimizer Diagrams
Given a parameterized n-dimensional SQL query template and a choice of query optimizer, a plan diagram is a color-coded pictorial enumeration of the execution plan choices of the optimizer over the query parameter space. Similarly, we can define cost diagram and cardinality diagram as the pictorial enumerations of cost and cardinality estimations of the optimizer over the same space. These thre...
A Signal Processing Approach to Estimate Underwater Network Cardinalities with Lower Complexity
This research examines a signal processing approach to estimating underwater network cardinalities. Cardinality estimation is a key concern for an underwater network, because the number of active nodes changes frequently for natural and artificial reasons in harsh underwater conditions. So, a proper estimation technique is mandat...
Exploring the Space of Black-box Attacks on Deep Neural Networks
Existing black-box attacks on deep neural networks (DNNs) so far have largely focused on transferability, where an adversarial instance generated for a locally trained model can “transfer” to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model’s class probabilities, which do not rely on transfe...